PolarFormer: Multi-Camera 3D Object Detection with Polar Transformer
Authors
Abstract
3D object detection in autonomous driving aims to reason about “what” and “where” the objects of interest are in a 3D world. Following the conventional wisdom of previous 2D object detection, existing methods often adopt the canonical Cartesian coordinate system with perpendicular axes. However, we conjecture that this does not fit the nature of the ego car’s perspective, as each onboard camera perceives the world in the shape of a wedge, intrinsic to the imaging geometry, with radial (non-perpendicular) axes. Hence, in this paper we advocate the exploitation of the Polar coordinate system and propose a new Polar Transformer (PolarFormer) for more accurate 3D object detection in the bird’s-eye-view (BEV), taking as input only multi-camera 2D images. Specifically, we design a cross-attention based Polar detection head, without restriction on the shape of the input structure, to deal with irregular Polar grids. For tackling the unconstrained object scale variations along the Polar distance dimension, we further introduce a multi-scale Polar representation learning strategy. As a result, our model can make best use of the rasterized Polar representation by attending to the corresponding image observation in a sequence-to-sequence fashion, subject to the geometric constraints. Thorough experiments on the nuScenes dataset demonstrate that PolarFormer significantly outperforms state-of-the-art 3D object detection alternatives.
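To make the contrast between Cartesian and Polar BEV grids concrete, the following is a minimal sketch of how a point in the ego frame could be mapped to a cell of a rasterized Polar grid. It is not the PolarFormer implementation: the function names and the grid parameters (`max_range`, `num_rho`, `num_phi`) are hypothetical values chosen for illustration only.

```python
import math


def cartesian_to_polar(x, y):
    """Map a BEV point (x, y) in the ego frame to (rho, phi)."""
    rho = math.hypot(x, y)   # distance from the ego car
    phi = math.atan2(y, x)   # azimuth in [-pi, pi]
    return rho, phi


def polar_to_cartesian(rho, phi):
    """Inverse mapping, back to Cartesian BEV coordinates."""
    return rho * math.cos(phi), rho * math.sin(phi)


def polar_cell(x, y, max_range=50.0, num_rho=64, num_phi=256):
    """Return the (rho index, phi index) of the Polar grid cell that
    contains (x, y), or None if the point is beyond max_range.

    All grid parameters are assumed values, not taken from the paper.
    """
    rho, phi = cartesian_to_polar(x, y)
    if rho >= max_range:
        return None
    i = int(rho / max_range * num_rho)
    # Shift azimuth to [0, 2*pi] and wrap phi == pi onto bin 0.
    j = int((phi + math.pi) / (2 * math.pi) * num_phi) % num_phi
    return i, j


def cell_arc_length(i, max_range=50.0, num_rho=64, num_phi=256):
    """Arc length (in metres) of a cell at the centre radius of ring i."""
    rho_centre = (i + 0.5) * max_range / num_rho
    return rho_centre * 2 * math.pi / num_phi
```

Under these assumptions the cells are wedge-shaped: `cell_arc_length` grows linearly with the ring index, so a cell far from the ego car covers more ground than a nearby one. This is one way to see why object scale varies along the distance dimension of a Polar grid.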
Similar resources
Multi-camera Multi-Object Tracking
In this paper, we propose a pipeline for multi-target visual tracking under a multi-camera system. In multi-camera tracking, efficient data association across cameras and, at the same time, across frames becomes more important than in single-camera tracking. However, most multi-camera tracking algorithms emphasize data association across frames within a single camera. Thus in o...
A 3D Time of Flight Camera for Object Detection
The knowledge of three-dimensional data is essential for many control and navigation applications. Especially in the industrial and automotive environment a fast and reliable acquisition of 3D data has become a main requirement for future developments. Moreover low cost 3D imaging has the potential to open a wide field of additional applications and solutions in markets like consumer electronic...
Multi-Camera Collision Detection allowing for Object Occlusions
A multi-camera-based collision detection system is presented. We describe the computation of global collision information for the entire surveilled workspace based on local collision information extracted from camera images. If there are known occlusions (e.g., by the robot), the system is able to recover object collision information by fusing multiple camera images. The algorithm presented is ...
Object detection with single-camera stereo
Many fielded mobile robot systems have demonstrated the importance of directly estimating the 3D shape of objects in the robot’s vicinity. The most mature solutions available today use active laser scanning or stereo camera pairs, but both approaches require specialized and expensive sensors. In prior publications, we have demonstrated the generation of stereo images from a single very low-cost...
3D Object Detection with Kinect
The goal of our project is to develop a general machine learning framework for classifying objects based on RGBD point cloud data from a Kinect. Using this framework, a robot equipped with a Kinect will take the name of an object as input, scan its surroundings, and move to the most likely matching object that it finds. As a proof of concept, we demonstrate our algorithm on an offic...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i1.25185